Part 1: {riskmetric} & {riskassessment} - a mini series for end-to-end R package validation

Eric Milliman

Disclaimer




Any opinions expressed in this presentation and on the following slides are solely those of the presenter and do not necessarily reflect those of the organizations sponsoring the work

  • Group of ~50 companies (mostly pharma & biotech)

  • Mission: Enable the use of R by the Bio-Pharmaceutical industry in a regulatory setting, where the output may be used in submissions to regulatory agencies.

Two tools: what do they do



{riskmetric} is a framework to quantify an R package’s “risk” by assessing several meaningful metrics designed to evaluate package development best practices, code documentation, community engagement, and development sustainability.


{riskassessment} is a full-fledged R package containing a shiny front-end that augments the utility of {riskmetric}. The application’s goal is to provide a central hub for an organization to review and assess the risk of R packages, providing handy tools and guide rails along the way.

A primer on risk

  • Risk is a combination of quality, intended use, and culture
    • Biogen has 3 levels of risk: death, reversible/temporary harm, or no harm to patients
  • Quality is one way to mitigate risk
  • High quality software can still be high risk

Criteria to quantify risk


Sometimes “quality” is measurable! Software development best practices dictate that an R package should have:

  • A license
  • Source code available for browsing
  • An easy-to-contact maintainer
  • A place to report bugs
  • Evidence that new bugs are being addressed
  • Complete function documentation
  • Test coverage
  • Community usage

18 total assessments (to date)!

Getting started

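library(dplyr)   # provides the %>% pipe
library(tidyr)
devtools::load_all("~/riskmetric/")   # load the development version of {riskmetric}

# Build references from mixed sources, assess each package, then score the assessments
(pkgRef <- pkg_ref(c("~/package_sources/accrual/", "dplyr", "tools", "arules", "limma")) %>%
  pkg_assess() %>%
  pkg_score())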
# A tibble: 5 × 22
  package version pkg_ref             pkg_score news_current has_vignettes
  <chr>   <chr>   <lst_f_p_>              <dbl> <pkg_scor>   <pkg_scor>   
1 accrual 1.3     accrual<source>         0.846 0            0            
2 dplyr   1.1.2   dplyr<install>          0.463 1            1            
3 tools   4.2.0   tools<install>          0.773 0            0            
4 arules  1.7-6   arules<cran_remote>     0.455 1            1            
5 limma   3.56.2  limma<bioc_remote>      0.812 0            1            
# ℹ 16 more variables: size_codebase <pkg_scor>,
#   has_bug_reports_url <pkg_scor>, bugs_status <pkg_scor>, license <pkg_scor>,
#   export_help <pkg_scor>, reverse_dependencies <pkg_scor>,
#   downloads_1yr <pkg_scor>, dependencies <pkg_scor>, has_website <pkg_scor>,
#   r_cmd_check <pkg_scor>, remote_checks <pkg_scor>,
#   has_maintainer <pkg_scor>, exported_namespace <pkg_scor>,
#   has_news <pkg_scor>, has_source_control <pkg_scor>, …

The internals of riskmetric

  • pkg_ref (class) and pkg_ref_cache (func)
    • Collects metadata from different sources
    • Stores raw metadata
    • Lazy evaluation of metadata
  • pkg_assess (class) and assess_* (func)
    • Tabular summary of metadata
    • e.g. number of Errors/Warnings/Notes from R CMD check
  • pkg_metric (class) and metric_score (func)
    • Per-assessment score
    • Bounded in [0, 1]
    • Multiple metrics per assessment possible
  • pkg_score (class/func)
    • Summary of metric scores
    • Bounded in [0, 1]
    • Can customize weights of custom metrics
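
The reference, assessment, and metric layers above can be illustrated with a minimal base-R sketch. Everything here is hypothetical (`new_lazy_ref`, `get_field`, `assess_license`, `metric_has_license`, and the toy metadata); it only mimics the idea of lazily evaluated, cached metadata feeding a [0, 1] metric, and is not the {riskmetric} source.

```r
# Hypothetical reference object: stores *how* to fetch each field, not the value.
new_lazy_ref <- function(name, fetchers) {
  ref <- new.env(parent = emptyenv())
  ref$name <- name
  ref$fetchers <- fetchers   # named list of zero-argument functions
  ref$cache <- list()        # raw metadata, filled in on first access
  ref
}

# Fetch a field on first use, then serve it from the cache.
get_field <- function(ref, field) {
  if (is.null(ref$cache[[field]])) {
    ref$cache[[field]] <- ref$fetchers[[field]]()
  }
  ref$cache[[field]]
}

# Counter used only to show that the fetcher runs once.
counter <- new.env()
counter$n <- 0

ref <- new_lazy_ref("examplepkg", list(
  description = function() {
    counter$n <- counter$n + 1
    list(Package = "examplepkg", License = "MIT")  # toy metadata
  }
))

# Assessment: a summary pulled from raw metadata.
assess_license <- function(ref) get_field(ref, "description")$License

# Metric: a per-assessment score in [0, 1].
metric_has_license <- function(assessment) {
  as.numeric(is.character(assessment) && nzchar(assessment))
}

first  <- get_field(ref, "description")   # triggers the fetch
second <- get_field(ref, "description")   # served from the cache
license_score <- metric_has_license(assess_license(ref))
```

Nothing is fetched until a field is requested, and repeated assessments reuse the cached value instead of recomputing it.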

Things to consider when using {riskmetric}

  • Package source
    • Not all metrics are available for all types of package sources
    • `pkg_ref()` can generate a list of mixed sources
  • Missing information
    • Metric only implemented for pkg_refs of a specific source type
      • e.g. code coverage is only available for pkg_ref.pkg_source
    • Parsing error – generally handled by {riskmetric}
    • Metric not expected because metadata is missing
      • e.g. a package without a bug_reporting_url is not expected to have a metric for bug closures over the last 30 days
  • Metric weights
    • Weights are applied to all pkg_metric regardless of source type.
    # A tibble: 5 × 5
      pkg_ref             version remote_checks r_cmd_check size_codebase
      <lst_f_p_>          <chr>   <pkg_scor>    <pkg_scor>  <pkg_scor>   
    1 accrual<source>     1.3     NA            NA          0.25510204   
    2 dplyr<install>      1.1.2   NA            NA                  NA   
    3 tools<install>      4.2.0   NA            NA          0.01213789   
    4 arules<cran_remote> 1.7-6    1            NA                  NA   
    5 limma<bioc_remote>  3.56.2  NA            NA                  NA   
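
To see why weights and missing metrics interact, the three metric columns from the table above can be summarized in two different ways. This is a hypothetical sketch: the equal weights and the functions `summarize_all` and `summarize_available` are assumptions for illustration, and neither is claimed to be what `pkg_score()` does.

```r
# Metric scores copied from the table above.
scores <- data.frame(
  package       = c("accrual", "dplyr", "tools", "arules", "limma"),
  remote_checks = c(NA, NA, NA, 1, NA),
  r_cmd_check   = rep(NA_real_, 5),
  size_codebase = c(0.25510204, NA, 0.01213789, NA, NA)
)

# Hypothetical equal weights, applied to every package regardless of source type.
weights <- c(remote_checks = 1, r_cmd_check = 1, size_codebase = 1)
m <- as.matrix(scores[names(weights)])

# Share of packages missing each metric.
missing_share <- colMeans(is.na(m))

# Option 1: a missing metric contributes 0 and keeps its weight.
summarize_all <- function(m, w) {
  as.vector(replace(m, is.na(m), 0) %*% w) / sum(w)
}

# Option 2: renormalize over the metrics that are actually available.
summarize_available <- function(m, w) {
  available <- !is.na(m)
  num <- as.vector(replace(m, !available, 0) %*% w)
  den <- as.vector((available * 1) %*% w)
  ifelse(den > 0, num / den, NA_real_)
}

over_all       <- setNames(summarize_all(m, weights), scores$package)
over_available <- setNames(summarize_available(m, weights), scores$package)
```

For arules, the single available metric yields 1/3 under the first option and 1 under the second, so summaries of packages from different sources are only comparable once the handling of missing metrics is fixed.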

Current Roadmap

  1. Increase ease of use
    1. Convenient wrapper functions
    2. Helpful messaging
    3. Cleaner reporting/output
  2. Completeness
# A tibble: 18 × 8
   generic     default pkg_ref pkg_remote pkg_install pkg_source pkg_bioc_remote
   <chr>         <dbl>   <dbl>      <dbl>       <dbl>      <dbl>           <dbl>
 1 assess_new…      NA       1          1          NA         NA              NA
 2 assess_has…      NA       1         NA          NA         NA              NA
 3 assess_siz…       1      NA         NA           1          1              NA
 4 assess_has…       1      NA         NA          NA         NA              NA
 5 assess_exp…      NA      NA          1           1          1              NA
 6 assess_rev…       1      NA         NA          NA         NA              NA
 7 assess_dow…      NA       1         NA          NA         NA              NA
 8 assess_dep…       1      NA         NA           1          1               1
 9 assess_r_c…       1      NA         NA          NA          1               1
10 assess_rem…       1      NA         NA          NA         NA               1
11 assess_exp…       1      NA         NA           1          1              NA
12 assess_has…      NA       1         NA          NA         NA              NA
13 assess_cov…       1      NA         NA          NA          1              NA
14 assess_las…       1      NA         NA          NA         NA              NA
15 assess_lic…       1      NA         NA          NA         NA              NA
16 assess_has…       1      NA         NA          NA         NA              NA
17 assess_has…       1      NA         NA          NA         NA              NA
18 assess_has…       1      NA         NA          NA         NA              NA
# ℹ 1 more variable: pkg_cran_remote <dbl>
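
A generic-by-source-class table like the one above could be built by checking which S3 methods exist for each generic. The sketch below uses toy generics and methods (`assess_demo_news`, `assess_demo_coverage`, and the helper `has_method` are all hypothetical) and a simplified lookup by function name; it is not how the table on the slide was produced.

```r
# Toy generics with methods for only some source classes.
assess_demo_news <- function(x, ...) UseMethod("assess_demo_news")
assess_demo_news.pkg_remote  <- function(x, ...) "parsed from remote"
assess_demo_news.pkg_install <- function(x, ...) "parsed from library"

assess_demo_coverage <- function(x, ...) UseMethod("assess_demo_coverage")
assess_demo_coverage.pkg_source <- function(x, ...) "computed from source"

generics <- c("assess_demo_news", "assess_demo_coverage")
classes  <- c("pkg_remote", "pkg_install", "pkg_source")

# Simplification: a method "exists" if a function named generic.class is visible.
# Methods registered only inside a package namespace would need a different lookup.
has_method <- function(generic, cls) {
  exists(paste0(generic, ".", cls), mode = "function")
}

# Rows are generics, columns are source classes.
coverage_matrix <- vapply(classes, function(cls) {
  vapply(generics, has_method, logical(1), cls = cls)
}, logical(length(generics)))
```

Each `FALSE` cell marks an assessment that has no implementation for that source type.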

Current Roadmap

  1. Increase ease of use
    1. Convenient wrapper functions
    2. Helpful messaging
    3. Cleaner reporting/output
  2. Completeness
    1. Consistency in source -> assessment -> metric
    2. Chain sources to increase metric coverage for analysis
  3. Third-party metrics
    1. Assessments/metrics that are executed only if package dependencies are installed
      1. oysteR, srr, autotest, pkgnet, valtools
    2. Ad hoc assessments and/or metrics
  4. Cohorts – collections of packages
    1. A set of packages (e.g. tidyverse)
    2. An environment (base, priority, plus packages required by the business)
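
The idea of third-party assessments that run only when their dependency is installed can be sketched with base R's `requireNamespace()`. The wrapper `assess_if_available` and the toy assessment functions are hypothetical; this roadmap item is not described as implemented.

```r
# Hypothetical helper: run `assess_fun` only when `required_pkg` is installed.
assess_if_available <- function(required_pkg, assess_fun, ...) {
  if (!requireNamespace(required_pkg, quietly = TRUE)) {
    message("Skipping assessment: package '", required_pkg, "' is not installed")
    return(NA)
  }
  assess_fun(...)
}

# Dependency missing: the assessment is skipped and reported as NA.
skipped <- assess_if_available("aPackageThatIsNotInstalled", function() 1)

# Dependency present ({stats} ships with R): the assessment runs.
ran <- assess_if_available("stats", function(x) stats::median(x), x = c(1, 2, 10))
```

Returning `NA` rather than an error keeps a skipped assessment distinguishable from a computed one downstream.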

How to contribute

What makes a good metric

  1. Self-contained
    1. License compatibility with dependencies
  2. Is environment agnostic
    1. R CMD check
  3. Has clear interpretation
    1. Version release frequency
  4. Can be represented numerically
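
As one illustration of a metric that can be represented numerically, consider the share of exported functions that have documentation. The function `score_export_help` and the toy assessment vector are assumptions for this sketch, not the {riskmetric} implementation, and the sketch takes no position on the examples listed in the slide.

```r
# Hypothetical metric: fraction of exported functions with a help page.
score_export_help <- function(has_help) {
  stopifnot(is.logical(has_help), length(has_help) > 0, !anyNA(has_help))
  mean(has_help)   # always within [0, 1]
}

# Toy assessment: one logical per exported function.
assessment <- c(filter = TRUE, mutate = TRUE, helper_fn = FALSE, summarise = TRUE)
score <- score_export_help(assessment)   # 3 of 4 documented: 0.75
```

The score needs only the package itself, does not depend on the machine it runs on, and reads directly as "75% of exports are documented".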

A new package and resource {riskscore}

  • Currently in alpha on GitHub
    • Needs to be QC’d
    • Need to make sure API rate limits haven’t created missing values or errors
    • Plans to add a data frame of assessments
  • Only covers CRAN, using the pkg_cran reference source
  • A community resource
    • Helps to contextualize risk scores for users
    • Will also help the dev team (scoring changes, edge cases, etc.)
  • Not a replacement for doing your own risk assessment
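
One way a reference set of scores can contextualize a single package is to place its score within the empirical distribution of the others. The sketch below reuses the five `pkg_score` values printed in "Getting started" as a stand-in reference set; a real comparison would presumably use the CRAN-wide scores from {riskscore}, whose API is not shown here.

```r
# pkg_score values from the "Getting started" output, used as a toy reference set.
reference <- c(accrual = 0.846, dplyr = 0.463, tools = 0.773,
               arules = 0.455, limma = 0.812)

# Empirical cumulative distribution of the reference scores.
score_percentile <- stats::ecdf(reference)

# Share of reference packages scoring at or below dplyr.
dplyr_percentile <- score_percentile(reference[["dplyr"]])   # 2 of 5: 0.4
```

With only five packages the percentile is coarse; the point is the mechanism, not the number.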

Some insights from {riskscore}

Missingness

Binary Metrics

Package risk correlates with metrics

Metrics with continuous scores

Metric score correlation with risk score

Package clusters

Dev Team

  • Current Contributors
    • Eli Miller - Atorus
    • Sheng Wei - J&J
    • Sam Parsam - Pfizer
    • Narayan Iyer - Pfizer
    • Andrew Borgman - Biogen
  • Past Contributors
    • Doug Kelkhoff - Genentech
    • Yilong Zhang - Meta
    • Marly Cormar - Apple
    • Kevin Kunzmann - Boehringer Ingelheim






Thank you!